Dataset statistics
| Number of variables | 7 |
|---|---|
| Number of observations | 1982 |
| Missing cells | 11 |
| Missing cells (%) | 0.1% |
| Duplicate rows | 38 |
| Duplicate rows (%) | 1.9% |
| Total size in memory | 108.5 KiB |
| Average record size in memory | 56.1 B |
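The percentages and the average record size in the table above follow from the raw counts. A minimal sketch of that arithmetic, assuming the missing-cell percentage is taken over all cells (observations × variables) and that 1 KiB is 1024 bytes:

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 7
n_observations = 1982
missing_cells = 11
duplicate_rows = 38
total_size_kib = 108.5

# Assumption: the missing-cell percentage is relative to every cell in the table.
missing_pct = 100 * missing_cells / (n_variables * n_observations)
duplicate_pct = 100 * duplicate_rows / n_observations
# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```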
Variable types
| Numeric | 6 |
|---|---|
| DateTime | 1 |
Alerts
| Dataset has 38 (1.9%) duplicate rows | Duplicates |
|---|---|
| sales is highly skewed (γ1 = 43.05439499) | Skewed |
| sales has 549 (27.7%) zeros | Zeros |
Reproduction
| Analysis started | 2024-03-20 18:19:40.214139 |
|---|---|
| Analysis finished | 2024-03-20 18:24:10.244842 |
| Duration | 4 minutes and 30.03 seconds |
| Software version | ydata-profiling v4.2.0 |
| Download configuration | config.json |
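The reported duration can be checked against the two timestamps in the table. A small sketch using only the standard library:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2024-03-20 18:19:40.214139", FMT)
finished = datetime.strptime("2024-03-20 18:24:10.244842", FMT)

# Split the elapsed time into whole minutes and remaining seconds.
duration = finished - started
minutes, seconds = divmod(duration.total_seconds(), 60)
print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```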
store_location_key
Real number (ℝ)
| Distinct | 36 |
|---|---|
| Distinct (%) | 1.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 6622.5439 |
| Minimum | 1396 |
| Maximum | 9807 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 15.6 KiB |
Quantile statistics
| Minimum | 1396 |
|---|---|
| 5-th percentile | 1396 |
| Q1 | 6973 |
| median | 7296 |
| Q3 | 8142 |
| 95-th percentile | 9604 |
| Maximum | 9807 |
| Range | 8411 |
| Interquartile range (IQR) | 1169 |
Descriptive statistics
| Standard deviation | 2463.5767 |
|---|---|
| Coefficient of variation (CV) | 0.37199854 |
| Kurtosis | 0.32819427 |
| Mean | 6622.5439 |
| Median Absolute Deviation (MAD) | 846 |
| Skewness | -1.2252283 |
| Sum | 13125882 |
| Variance | 6069210.1 |
| Monotonicity | Not monotonic |
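Several rows in the quantile and descriptive tables are derived from other rows. The sketch below recomputes them for store_location_key from the reported figures; small differences in the last digits are expected because the inputs are already rounded.

```python
# Values copied from the store_location_key tables.
n_observations = 1982
minimum, maximum = 1396, 9807
q1, q3 = 6973, 8142
mean = 6622.5439
std = 2463.5767

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range (IQR)
cv = std / mean                   # Coefficient of variation (CV)
variance = std ** 2               # Variance is the squared standard deviation
total = mean * n_observations     # Sum (this column has no missing values)

print(value_range, iqr, round(cv, 6), round(variance, 1), round(total))
```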
Histogram with fixed size bins (bins=36)
| Value | Count | Frequency (%) |
| 8142 | 526 | 26.5% |
| 6973 | 321 | 16.2% |
| 7296 | 277 | 14.0% |
| 1396 | 249 | 12.6% |
| 9604 | 141 | 7.1% |
| 4823 | 95 | 4.8% |
| 7104 | 65 | 3.3% |
| 1891 | 58 | 2.9% |
| 9807 | 52 | 2.6% |
| 7167 | 38 | 1.9% |
| Other values (26) | 160 | 8.1% |
| Value | Count | Frequency (%) |
| 1396 | 249 | 12.6% |
| 1842 | 4 | 0.2% |
| 1891 | 58 | 2.9% |
| 2063 | 6 | 0.3% |
| 2428 | 7 | 0.4% |
| 4823 | 95 | 4.8% |
| 4861 | 1 | 0.1% |
| 6905 | 14 | 0.7% |
| 6941 | 2 | 0.1% |
| 6946 | 8 | 0.4% |
| Value | Count | Frequency (%) |
| 9807 | 52 | 2.6% |
| 9802 | 3 | 0.2% |
| 9604 | 141 | 7.1% |
| 8207 | 1 | 0.1% |
| 8187 | 4 | 0.2% |
| 8161 | 2 | 0.1% |
| 8142 | 526 | 26.5% |
| 8110 | 1 | 0.1% |
| 7317 | 25 | 1.3% |
| 7313 | 2 | 0.1% |
product_key
Real number (ℝ)
| Distinct | 811 |
|---|---|
| Distinct (%) | 40.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.9398553 × 10¹⁴ |
| Minimum | 8.1000023 × 10⁸ |
| Maximum | 1 × 10¹⁵ |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 15.6 KiB |
Quantile statistics
| Minimum | 8.1000023 × 10⁸ |
|---|---|
| 5-th percentile | 5.557742 × 10⁹ |
| Q1 | 7.0942302 × 10⁹ |
| median | 8.8358363 × 10¹⁰ |
| Q3 | 1 × 10¹⁵ |
| 95-th percentile | 1 × 10¹⁵ |
| Maximum | 1 × 10¹⁵ |
| Range | 9.9999919 × 10¹⁴ |
| Interquartile range (IQR) | 9.9999291 × 10¹⁴ |
Descriptive statistics
| Standard deviation | 5.0005011 × 10¹⁴ |
|---|---|
| Coefficient of variation (CV) | 1.0122769 |
| Kurtosis | -2.001431 |
| Mean | 4.9398553 × 10¹⁴ |
| Median Absolute Deviation (MAD) | 8.7437863 × 10¹⁰ |
| Skewness | 0.024236363 |
| Sum | 9.7907931 × 10¹⁷ |
| Variance | 2.5005012 × 10²⁹ |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 1 × 10¹⁵ | 521 | 26.3% |
| 1 × 10¹⁵ | 285 | 14.4% |
| 1 × 10¹⁵ | 142 | 7.2% |
| 7.710588004 × 10¹⁰ | 21 | 1.1% |
| 1 × 10¹⁵ | 18 | 0.9% |
| 8.04906003 × 10¹⁰ | 8 | 0.4% |
| 7.710588004 × 10¹⁰ | 7 | 0.4% |
| 4.160000017 × 10¹⁰ | 7 | 0.4% |
| 4.141000022 × 10¹⁰ | 6 | 0.3% |
| 7.710589 × 10¹⁰ | 5 | 0.3% |
| Other values (801) | 962 | 48.5% |
| Value | Count | Frequency (%) |
| 810000231 | 1 | 0.1% |
| 810000394 | 1 | 0.1% |
| 810000413 | 1 | 0.1% |
| 912848856 | 1 | 0.1% |
| 928151100 | 1 | 0.1% |
| 1125000024 | 1 | 0.1% |
| 1150900378 | 1 | 0.1% |
| 1150901811 | 1 | 0.1% |
| 1204403889 | 1 | 0.1% |
| 1660000087 | 1 | 0.1% |
| Value | Count | Frequency (%) |
| 1 × 10¹⁵ | 285 | 14.4% |
| 1 × 10¹⁵ | 1 | 0.1% |
| 1 × 10¹⁵ | 521 | 26.3% |
| 1 × 10¹⁵ | 5 | 0.3% |
| 1 × 10¹⁵ | 18 | 0.9% |
| 1 × 10¹⁵ | 142 | 7.2% |
| 1 × 10¹⁵ | 3 | 0.2% |
| 1 × 10¹⁵ | 2 | 0.1% |
| 1 × 10¹⁵ | 1 | 0.1% |
| 1 × 10¹⁵ | 1 | 0.1% |
collector_key
Real number (ℝ)
| Distinct | 356 |
|---|---|
| Distinct (%) | 18.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.736261 × 10¹⁰ |
| Minimum | -1 |
| Maximum | 1.4785035 × 10¹¹ |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 1589 |
| Negative (%) | 80.2% |
| Memory size | 15.6 KiB |
Quantile statistics
| Minimum | -1 |
|---|---|
| 5-th percentile | -1 |
| Q1 | -1 |
| median | -1 |
| Q3 | -1 |
| 95-th percentile | 1.4113287 × 10¹¹ |
| Maximum | 1.4785035 × 10¹¹ |
| Range | 1.4785035 × 10¹¹ |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 5.5049437 × 10¹⁰ |
|---|---|
| Coefficient of variation (CV) | 2.0118489 |
| Kurtosis | 0.30419905 |
| Mean | 2.736261 × 10¹⁰ |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 1.5167011 |
| Sum | 5.4232692 × 10¹³ |
| Variance | 3.0304405 × 10²¹ |
| Monotonicity | Not monotonic |
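The value -1 accounts for 1589 of the 1982 collector_key values, which drives the median, Q1, Q3 and MAD above. One plausible reading is that -1 is a placeholder for "no collector"; the report does not confirm this. Under that assumption, a minimal sketch of masking the placeholder before computing statistics (`SENTINEL` and `mask_sentinel` are illustrative names):

```python
SENTINEL = -1  # assumed "no collector" marker; not confirmed by the report

def mask_sentinel(values, sentinel=SENTINEL):
    """Return a copy of values with the sentinel replaced by None."""
    return [None if v == sentinel else v for v in values]

# collector_key values of sample rows 0 to 3 in the report
sample = [-1, -1, 134585287507, -1]
masked = mask_sentinel(sample)
share_masked = masked.count(None) / len(masked)
print(masked, share_masked)
```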
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| -1 | 1589 | 80.2% |
| 1.345155956 × 10¹¹ | 5 | 0.3% |
| 1.3947 × 10¹¹ | 4 | 0.2% |
| 1.37314 × 10¹¹ | 3 | 0.2% |
| 1.39447 × 10¹¹ | 3 | 0.2% |
| 1.37274 × 10¹¹ | 2 | 0.1% |
| 1.372744939 × 10¹¹ | 2 | 0.1% |
| 1.373212129 × 10¹¹ | 2 | 0.1% |
| 1.34527 × 10¹¹ | 2 | 0.1% |
| 1.39481 × 10¹¹ | 2 | 0.1% |
| Other values (346) | 368 | 18.6% |
| Value | Count | Frequency (%) |
| -1 | 1589 | 80.2% |
| 1.34401893 × 10¹¹ | 1 | 0.1% |
| 1.34405 × 10¹¹ | 1 | 0.1% |
| 1.344085042 × 10¹¹ | 1 | 0.1% |
| 1.3440982 × 10¹¹ | 1 | 0.1% |
| 1.34410033 × 10¹¹ | 1 | 0.1% |
| 1.344130483 × 10¹¹ | 1 | 0.1% |
| 1.34415 × 10¹¹ | 1 | 0.1% |
| 1.344522226 × 10¹¹ | 1 | 0.1% |
| 1.344524125 × 10¹¹ | 1 | 0.1% |
| Value | Count | Frequency (%) |
| 1.478503527 × 10¹¹ | 1 | 0.1% |
| 1.478414164 × 10¹¹ | 1 | 0.1% |
| 1.47840146 × 10¹¹ | 1 | 0.1% |
| 1.428152676 × 10¹¹ | 1 | 0.1% |
| 1.428146626 × 10¹¹ | 1 | 0.1% |
| 1.428140354 × 10¹¹ | 2 | 0.1% |
| 1.428111318 × 10¹¹ | 1 | 0.1% |
| 1.42811 × 10¹¹ | 1 | 0.1% |
| 1.428106383 × 10¹¹ | 1 | 0.1% |
| 1.428071675 × 10¹¹ | 1 | 0.1% |
trans_dt
Date
| Distinct | 331 |
|---|---|
| Distinct (%) | 16.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 15.6 KiB |
| Minimum | 2015-03-31 00:00:00 |
| Maximum | 2016-10-21 00:00:00 |
Histogram with fixed size bins (bins=50)
sales
Real number (ℝ)
Alerts: Skewed, Zeros
| Distinct | 448 |
|---|---|
| Distinct (%) | 22.7% |
| Missing | 7 |
| Missing (%) | 0.4% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 20.145732 |
| Minimum | -62.3 |
| Maximum | 10299.58 |
| Zeros | 549 |
| Zeros (%) | 27.7% |
| Negative | 6 |
| Negative (%) | 0.3% |
| Memory size | 15.6 KiB |
Quantile statistics
| Minimum | -62.3 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 7.08 |
| Q3 | 17.155 |
| 95-th percentile | 55.158 |
| Maximum | 10299.58 |
| Range | 10361.88 |
| Interquartile range (IQR) | 17.155 |
Descriptive statistics
| Standard deviation | 233.93189 |
|---|---|
| Coefficient of variation (CV) | 11.611983 |
| Kurtosis | 1891.5499 |
| Mean | 20.145732 |
| Median Absolute Deviation (MAD) | 7.08 |
| Skewness | 43.054395 |
| Sum | 39787.82 |
| Variance | 54724.127 |
| Monotonicity | Not monotonic |
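The skewness of 43.05 reflects a single very large value (10299.58) against a median of 7.08. The sketch below shows one common way to compute sample skewness, the bias-corrected (adjusted Fisher-Pearson) estimator that pandas `Series.skew` also uses. That the report relies on this exact estimator is an assumption, and `sample_skewness` is an illustrative name.

```python
import math

def sample_skewness(values):
    """Adjusted Fisher-Pearson skewness (bias-corrected), ignoring None."""
    xs = [x for x in values if x is not None]
    n = len(xs)
    if n < 3:
        raise ValueError("need at least three non-missing values")
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n   # second central moment
    m3 = sum((x - mean) ** 3 for x in xs) / n   # third central moment
    g1 = m3 / m2 ** 1.5                         # population skewness
    return math.sqrt(n * (n - 1)) / (n - 2) * g1

# Toy data (not the profiled column): one extreme value among small ones.
toy = [0, 0, 7.08, 17.16, None, 10299.58]
print(sample_skewness(toy) > 0)
```

A symmetric sample such as `[1, 2, 3, 4, 5]` gives a skewness of zero with this function.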
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 0 | 549 | 27.7% |
| 7.32 | 54 | 2.7% |
| 3.19 | 30 | 1.5% |
| 8.9 | 30 | 1.5% |
| 5.32 | 26 | 1.3% |
| 1.78 | 26 | 1.3% |
| 20.45 | 24 | 1.2% |
| 4.97 | 22 | 1.1% |
| 9.77 | 22 | 1.1% |
| 8.88 | 22 | 1.1% |
| Other values (438) | 1170 | 59.0% |
| Value | Count | Frequency (%) |
| -62.3 | 1 | 0.1% |
| -55.14 | 1 | 0.1% |
| -40.92 | 1 | 0.1% |
| -39.39 | 1 | 0.1% |
| -35.58 | 1 | 0.1% |
| -9.49 | 1 | 0.1% |
| 0 | 549 | 27.7% |
| 0.02 | 2 | 0.1% |
| 0.09 | 6 | 0.3% |
| 0.18 | 1 | 0.1% |
| Value | Count | Frequency (%) |
| 10299.58 | 1 | 0.1% |
| 748.99 | 1 | 0.1% |
| 424.19 | 1 | 0.1% |
| 389.27 | 1 | 0.1% |
| 381.17 | 1 | 0.1% |
| 372.45 | 1 | 0.1% |
| 332.09 | 1 | 0.1% |
| 268.83 | 1 | 0.1% |
| 251.89 | 1 | 0.1% |
| 240.03 | 1 | 0.1% |
units
Real number (ℝ)
| Distinct | 11 |
|---|---|
| Distinct (%) | 0.6% |
| Missing | 4 |
| Missing (%) | 0.2% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.1314459 |
| Minimum | -2 |
| Maximum | 18 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 17 |
| Negative (%) | 0.9% |
| Memory size | 15.6 KiB |
Quantile statistics
| Minimum | -2 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 1 |
| Q3 | 1 |
| 95-th percentile | 2 |
| Maximum | 18 |
| Range | 20 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.75462629 |
|---|---|
| Coefficient of variation (CV) | 0.66695746 |
| Kurtosis | 149.25188 |
| Mean | 1.1314459 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 9.0107175 |
| Sum | 2238 |
| Variance | 0.56946083 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=11)
| Value | Count | Frequency (%) |
| 1 | 1801 | 90.9% |
| 2 | 106 | 5.3% |
| 3 | 22 | 1.1% |
| 4 | 16 | 0.8% |
| -1 | 15 | 0.8% |
| 7 | 7 | 0.4% |
| 5 | 5 | 0.3% |
| -2 | 2 | 0.1% |
| 8 | 2 | 0.1% |
| 6 | 1 | 0.1% |
| (Missing) | 4 | 0.2% |
| Value | Count | Frequency (%) |
| -2 | 2 | 0.1% |
| -1 | 15 | 0.8% |
| 1 | 1801 | 90.9% |
| 2 | 106 | 5.3% |
| 3 | 22 | 1.1% |
| 4 | 16 | 0.8% |
| 5 | 5 | 0.3% |
| 6 | 1 | 0.1% |
| 7 | 7 | 0.4% |
| 8 | 2 | 0.1% |
| Value | Count | Frequency (%) |
| 18 | 1 | 0.1% |
| 8 | 2 | 0.1% |
| 7 | 7 | 0.4% |
| 6 | 1 | 0.1% |
| 5 | 5 | 0.3% |
| 4 | 16 | 0.8% |
| 3 | 22 | 1.1% |
| 2 | 106 | 5.3% |
| 1 | 1801 | 90.9% |
| -1 | 15 | 0.8% |
trans_key
Real number (ℝ)
| Distinct | 1927 |
|---|---|
| Distinct (%) | 97.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.4775147 × 10²⁵ |
| Minimum | 9.6072966 × 10²¹ |
| Maximum | 9.3065169 × 10²⁶ |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 15.6 KiB |
Quantile statistics
| Minimum | 9.6072966 × 10²¹ |
|---|---|
| 5-th percentile | 6.628896 × 10²³ |
| Q1 | 3.6672232 × 10²⁴ |
| median | 3.0786285 × 10²⁵ |
| Q3 | 6.1049635 × 10²⁵ |
| 95-th percentile | 1.7641637 × 10²⁶ |
| Maximum | 9.3065169 × 10²⁶ |
| Range | 9.3064208 × 10²⁶ |
| Interquartile range (IQR) | 5.7382412 × 10²⁵ |
Descriptive statistics
| Standard deviation | 1.0714889 × 10²⁶ |
|---|---|
| Coefficient of variation (CV) | 1.9561589 |
| Kurtosis | 27.81512 |
| Mean | 5.4775147 × 10²⁵ |
| Median Absolute Deviation (MAD) | 2.743367 × 10²⁵ |
| Skewness | 4.7081981 |
| Sum | 1.0856434 × 10²⁹ |
| Variance | 1.1480885 × 10⁵² |
| Monotonicity | Not monotonic |
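trans_key is profiled as a real number, yet the sample rows show it is an identifier with 24 or more digits. A 64-bit float represents integers exactly only up to 2⁵³ (about 9 × 10¹⁵), so storing such keys as floats cannot preserve every digit. The sketch below illustrates this with the trans_key of sample row 0; treating the key as text is a suggestion, not something the report prescribes.

```python
# trans_key of sample row 0, kept as an exact Python integer
trans_key = 651769604118220151022933

as_float = float(trans_key)   # what a float64 column would store
round_trip = int(as_float)    # convert back to an integer

print(round_trip == trans_key)   # False: digits were lost in the float
print(trans_key > 2 ** 53)       # True: beyond the exact-integer range of float64
key_as_text = str(trans_key)     # a string keeps every digit
print(key_as_text)
```

With pandas, one way to keep the digits is to pass `dtype={"trans_key": str}` to `read_csv` when loading the data.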
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 6.750796041 × 10²⁴ | 4 | 0.2% |
| 6.836096041 × 10²³ | 3 | 0.2% |
| 6.869096041 × 10²⁴ | 3 | 0.2% |
| 6.723496041 × 10²⁴ | 3 | 0.2% |
| 6.130311396 × 10²⁵ | 2 | 0.1% |
| 6.535696041 × 10²⁴ | 2 | 0.1% |
| 3.18016 × 10²⁴ | 2 | 0.1% |
| 1.715488142 × 10²⁶ | 2 | 0.1% |
| 6.736296041 × 10²⁴ | 2 | 0.1% |
| 3.161516973 × 10²⁵ | 2 | 0.1% |
| Other values (1917) | 1957 | 98.7% |
| Value | Count | Frequency (%) |
| 9.607296589 × 10²¹ | 1 | 0.1% |
| 1.957167118 × 10²² | 1 | 0.1% |
| 2.227296589 × 10²² | 1 | 0.1% |
| 2.527167118 × 10²² | 1 | 0.1% |
| 2.667167118 × 10²² | 1 | 0.1% |
| 3.303729659 × 10²² | 1 | 0.1% |
| 3.707167118 × 10²² | 1 | 0.1% |
| 3.80717 × 10²² | 1 | 0.1% |
| 4.00717 × 10²² | 1 | 0.1% |
| 4.10771756 × 10²² | 1 | 0.1% |
| Value | Count | Frequency (%) |
| 9.306516905 × 10²⁶ | 1 | 0.1% |
| 9.280676905 × 10²⁶ | 1 | 0.1% |
| 9.224066905 × 10²⁶ | 1 | 0.1% |
| 9.199276905 × 10²⁶ | 1 | 0.1% |
| 9.191826905 × 10²⁶ | 1 | 0.1% |
| 9.173326905 × 10²⁶ | 1 | 0.1% |
| 9.129276905 × 10²⁶ | 1 | 0.1% |
| 7.955687226 × 10²⁶ | 1 | 0.1% |
| 7.860737226 × 10²⁶ | 1 | 0.1% |
| 7.818451891 × 10²⁶ | 1 | 0.1% |
| | store_location_key | product_key | collector_key | sales | units | trans_key |
|---|---|---|---|---|---|---|
| store_location_key | 1.000 | 0.047 | -0.159 | -0.006 | -0.044 | -0.210 |
| product_key | 0.047 | 1.000 | -0.071 | -0.048 | -0.045 | 0.272 |
| collector_key | -0.159 | -0.071 | 1.000 | 0.112 | 0.023 | 0.061 |
| sales | -0.006 | -0.048 | 0.112 | 1.000 | 0.089 | -0.070 |
| units | -0.044 | -0.045 | 0.023 | 0.089 | 1.000 | 0.005 |
| trans_key | -0.210 | 0.272 | 0.061 | -0.070 | 0.005 | 1.000 |
A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
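Nullity correlation can be sketched as an ordinary Pearson correlation computed on missing-value indicators (1 where a value is missing, 0 otherwise). This is an illustrative implementation under that assumption, not the code the report ran; `nullity_correlation` and the toy columns are hypothetical.

```python
import math

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missing-value indicators of two columns."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")  # undefined when a column is never or always missing
    return cov / math.sqrt(var_a * var_b)

# Toy columns (not taken from the profiled dataset).
sales = [0.0, None, 25.79, None, 4.97]
units = [1, None, 1, None, 1]   # missing exactly where sales is missing
other = [1, 1, None, 1, 1]      # missing only where sales is present

print(nullity_correlation(sales, units))
print(nullity_correlation(sales, other))
```

A value near 1 means the two columns tend to be missing together, and a negative value means one tends to be present when the other is missing.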
First rows
| | store_location_key | product_key | collector_key | trans_dt | sales | units | trans_key |
|---|---|---|---|---|---|---|---|
| 0 | 9604 | 999999999999513 | -1 | 2015-10-22 | 0.00 | 1 | 651769604118220151022933 |
| 1 | 1396 | 999999999999134 | -1 | 2015-11-23 | 7.10 | 1 | 62432813961182201511231134 |
| 2 | 4823 | 6081506203 | 134585287507 | 2015-04-15 | 25.79 | 1 | 2841948231182201504151317 |
| 3 | 1396 | 77105880035 | -1 | 2015-11-09 | 1.76 | 1 | 62291113961182201511091111 |
| 4 | 9604 | 999999999999513 | -1 | 2015-12-30 | 0.00 | 1 | 680199604118220151230855 |
| 5 | 9807 | 77105810740 | -1 | 2015-09-29 | 15.11 | 1 | 2056098075671201509291354 |
| 6 | 1396 | 8390000626 | -1 | 2015-05-21 | 4.97 | 1 | 60529413961182201505211843 |
| 7 | 9604 | 999999999999513 | -1 | 2015-12-18 | 0.00 | 1 | 6765796041182201512181212 |
| 8 | 4823 | 1660000087 | -1 | 2015-09-17 | 1.58 | 1 | 3504948231182201509171216 |
| 9 | 1396 | 999999999999513 | -1 | 2015-11-02 | 7.32 | 1 | 62221313961182201511021422 |
Last rows
| | store_location_key | product_key | collector_key | trans_dt | sales | units | trans_key |
|---|---|---|---|---|---|---|---|
| 1972 | 7104 | 5847810237 | -1 | 2015-11-13 | 74.74 | 1 | 10928371041182201511131741 |
| 1973 | 7296 | 6334820448 | 137262865317 | 2015-10-16 | 5.86 | 1 | 55923729622282201510162142 |
| 1974 | 7296 | 6230070609 | -1 | 2015-11-08 | 12.23 | 3 | 60641729623570201511081447 |
| 1975 | 7296 | 999999999999200 | 139683045874 | 2015-12-08 | 14.24 | 2 | 66082729622282201512081502 |
| 1976 | 7296 | 6340002130 | -1 | 2015-11-14 | 5.32 | 1 | 61759729622967201511141724 |
| 1977 | 7296 | 999999999999142 | -1 | 2015-12-26 | 0.00 | 1 | 69280729623570201512261217 |
| 1978 | 7104 | 4155405415 | -1 | 2015-05-23 | 20.45 | 1 | 394837710421852201505231941 |
| 1979 | 7104 | 3607342350960 | 141339411236 | 2015-10-20 | 14.76 | 1 | 29907710422739201510201425 |
| 1980 | 7296 | 3300000200 | 140195131255 | 2016-01-08 | 20.45 | 1 | 71511729622282201601081509 |
| 1981 | 7104 | 6800079282 | 134546809449 | 2015-07-21 | 3.19 | 1 | 408498710422755201507211639 |
Most frequently occurring
| | store_location_key | product_key | collector_key | trans_dt | sales | units | trans_key | # duplicates |
|---|---|---|---|---|---|---|---|---|
| 25 | 9604 | 999999999999513 | -1 | 2015-12-15 | 0.00 | 1 | 6750796041182201512151016 | 4 |
| 22 | 9604 | 999999999999513 | -1 | 2015-12-08 | 0.00 | 1 | 6723496041182201512081113 | 3 |
| 31 | 9604 | 999999999999513 | -1 | 2016-01-08 | 0.00 | 1 | 683609604118220160108718 | 3 |
| 33 | 9604 | 999999999999513 | -1 | 2016-01-18 | 0.00 | 1 | 6869096041182201601181002 | 3 |
| 0 | 1396 | 999999999999513 | -1 | 2015-11-27 | 0.00 | 1 | 62484013961182201511271124 | 2 |
| 1 | 1396 | 999999999999513 | -1 | 2015-11-30 | 7.32 | 1 | 62505713961182201511301012 | 2 |
| 2 | 4823 | 999999999999513 | -1 | 2015-10-23 | 3.54 | 1 | 3658948231182201510231153 | 2 |
| 3 | 4823 | 999999999999513 | -1 | 2015-12-02 | 0.00 | 1 | 3826648231182201512021252 | 2 |
| 4 | 6973 | 1E+15 | 1.39481E+11 | 12/23/2015 | 22.73 | 1 | 3.18016E+24 | 2 |
| 5 | 6973 | 999999999999513 | -1 | 2015-11-06 | 0.00 | 1 | 31615169731182201511061035 | 2 |
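A table like the one above can be produced by counting identical rows and keeping those that occur more than once. The sketch below uses shortened toy rows rather than the full dataset, and `most_frequent_duplicates` is an illustrative name; how the report itself defines its duplicate-row count is not stated.

```python
from collections import Counter

def most_frequent_duplicates(rows, top=10):
    """Return [(row, count)] for rows occurring more than once, most frequent first."""
    counts = Counter(tuple(r) for r in rows)
    dupes = [(row, c) for row, c in counts.items() if c > 1]
    dupes.sort(key=lambda item: item[1], reverse=True)
    return dupes[:top]

# Toy rows: (store_location_key, trans_dt, sales, units)
rows = [
    (9604, "2015-12-15", 0.00, 1),
    (9604, "2015-12-15", 0.00, 1),
    (1396, "2015-11-27", 0.00, 1),
    (9604, "2015-12-15", 0.00, 1),
    (1396, "2015-11-27", 0.00, 1),
    (4823, "2015-04-15", 25.79, 1),
]
result = most_frequent_duplicates(rows)
for row, count in result:
    print(row, count)
```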